FIGURE 1A



Nucleotide Sequence of Human ABCG4 Transporter Gene 
Sequence Range: 1 to 3455 

GCCACCATGG CGGAGAAGGC GCTGGAGGCC GTGGGCTGTG GACTAGGGCC GGGGGCTGTG 60 
GCCATGGCCG TGACGCTGGA GGACGGGGCG GAACCCCCTG TGCTGACCAC GCACCTGAAG 120 
AAGGTGGAGA ACCACATCAC TGAAGCCCAG CGCTTCTCCC ACCTGCCCAA GCGCTCAGCC 180 
GTGGACATCG AGTTCGTGGA GCTGTCCTAT TCCGTGCGGG AGGGGCCCTG CTGGCGCAAA 240 
AGGGGTTATA AGACCCTTCT CAAGTGCCTC TCAGGTAAAT TCTGCCGCCG GGAGCTGATT 300 
GGCATCATGG GCCCCTCAGG GGCTGGCAAG TCTACATTCA TGAACATCTT GGCAGGATAC 360 
AGGGAGTCTG GAATGAAGGG GCAGATCCTG GTTAATGGAA GGCCACGGGA GCTGAGGACC 420 
TTCCGCAAGA TGTCCTGCTA CATCATGCAA GATGACATGC TGCTGCCGCA CCTCACGGTG 480 
TTGGAAGCCA TGATGGTCTC TGCTAACCTG AATCTTACTG AGAATCCCGA TGTGAAAAAC 540 
GATCTCGTGA CAGAGATCCT GACGGCACTG GGCCTGATGT CGTGCTCCCA CACGAGGACA 600 
GCCCTGCTCT CTGGCGGGCA GAGGAAGCGT CTGGCCATCG CCCTGGAGCT GGTCAACAAC 660 
CCGCCTGTCA TGTTCTTTGA TGAGCCCACC AGTGGTCTGG ATAGCGCCTC TTGTTTCCAA 720 
GTGGTGTCCC TCATGAAGTC CCTGGCACAG GGGGGCCGTA CCATCATCTG CACCATCCAC 780 
CAGCCCAGTG CCAAGCTCTT TGAGATGTTT GACAAGCTCT ACATCCTGAG CCAGGGTCAG 840 
TGCATCTTCA AAGGAGTGGT CACCAACCTG ATCCCCTATC TAAAGGGACT CGGCTTGCAT 900 
TGCCCCACCT ACCACAACCC GGCTGACTTC ATCATCGAGG TGGCCTCTGG CGAGTATGGA 960 
GACCTGAACC CCATGTTGTT CAGGGCTGTG CAGAATGGGC TGTGCGCTAT GGCTGAGAAG 1020 
AAGAGCAGCC CTGAGAAGAA CGAGGTGCCT GCCCCATGCC CTCCTTGTCC TCCGGAAGTG 1080 
GATCCCATTG AAAGCCACAC CTTTGCCACC AGCACCCTCA CACAGTTCTG CATCCTCTTC 1140
AAGAGGACCT TCCTGTCCAT CCTCAGGGAC ACGGTCCTGA CCCACCTACG GTTCATGTCC 1200 
CACGTGGTTA TTGGCGTGCT CATCGGCCTC CTCTACCTGC ATATTGGCGA CGATGCCAGC 1260
AAGGTCTTCA ACAACACCGG CTGCCTCTTC TTCTCCATGC TGTTCCTCAT GTTCGCCGCC 1320 
CTCATGCCAA CTGTGCTCAC CTTCCCCTTA GAGATGGCGG TCTTCATGAG GGAGCACCTC 1380 
AACTACTGGT ACAGCCTCAA AGCGTATTAC CTGGCCAAGA CCATGGCTGA CGTGCCCTTT 1440 
CAGGTGGTGT GTCCGGTGGT CTACTGCAGC ATTGTGTACT GGATGACGGG CCAGCCCGCT 1500 
GAGACCAGCC GCTTCCTGCT CTTCTCAGCC CTGGCCACCG CCACCGCCTT GGTGGCCCAA 1560 
TCTTTGGGGC TGCTGATCGG AGCTGCTTCC AACTCCCTAC AGGTGGCCAC TTTTGTGGGC 1620 
CCAGTTACCG CCATCCCTGT CCTCTTGTTC TCCGGCTTCT TTGTCAGCTT CAAGACCATC 1680
CCCACTTACC TGCAATGGAG CTCCTATCTC TCCTATGTCA GGTATGGCTT TGAGGGTGTG 1740
ATCCTGACGA TCTATGGCAT GGAGCGAGGA GACCTGACAT GTTTAGAGGA ACGCTGCCCG 1800 
TTCCGGGAGC CACAGAGCAT CCTCCGAGCG CTGGATGTGG AGGATGCCAA GCTCTACATG 1860 
GACTTCCTGG TCTTGGGCAT CTTCTTCCTA GCCCTGGGGC TGCTGGCCTA CCTTGTGCTG 1920 
CGTTACCGGG TCAAGTCAGA GAGATAGAGG CTTGCCCCAG CCTGTACCCC AGCCCCTGCA 1980 
GCAGGAAGCC CCCAGTCCCA GCCCTTTGGG ACTGTTTTAA CCTTATAGAC TTGGGCACTG 2040 
GTTCCTGGCG GGGCTATCCT CTCCTCCCTT GGCTCCTCCA CAGGCTGGCT GTCGGACTGC 2100 
GCTCCCAGCC TGGGCTCTGG GAGTGGGGGC TCCAGCCCTC CCCACTATGC CCAGGAGTCT 2160 
TCCCAAGTTG ATGCGGTTTG TAGCTTCCTC CCTACTCTCT CCAACACCTG CATGCAAAGA 2220 
CTACTGGGAG GCTGCTGCCT CCTTCCTGCC CATGGCACCC TCCTCTGCTG TCTGCCTGGG 2280 
AGCCCTAGGC TCTCTAGGGC CCCACTTACA ACTGACCAAA GTGGCCCCCT CTGGGGGTCC 2340 
CCACCACACA AGTGTTTGTA AACTGGGCTG CTATAAGGTT GGAGTTCCAG GGCTGGGCCC 2400 
TGGTGGAGTC CACTGGAAGT CCCATTATGG ATGTTGAAAT GGACAGGGAA GGACTCTGGA 2460 
AGTCTCTTCC TCCTCCTCCT CTTCTCTCCA CCCCTAGACC CTGGCTGACT TGGACAATCT 2520
GCCAGGACAG AAGCTGGGTT TTCTGTCTAG GTCACCACTC CCAATCCTGG GGATTGGAGA 2580 
GGCCTGGGGC TGTGGGATGC CCCATCCCCC TCCCCATCAC CTTTGGTGGG GGCAGGGCCT 2640
GGTGGCACCT GTGCAATAAT GTCTGTGTTT CTCTCCCACC TGCCACTGGA ACTGGAGAAT 2700 
GCACTTTATT CTGGGCGGGG GGTGAGTGGG GGAAGACCCA ACCCTCCTTT CTCGCTGCCC 2760 
CTAACGCATG CACGGTCTCG TGATGCTCCC TCCCTCTCCG GAGTGACAGG CACATACATG 2820 
AGAACAGGCC ATCTCAGCCC TACACACTTG CCATCCCCTA CAGCACAGAG GAAGAGTGAT 2880 
GGTGGCATGC TGGTGGTGGC GGGTGCTGGT GGGAGGACAG TGCCAACCTC CTCCTGGGGA 2940 
FIGURE 1B

TCCCATGTTG GAGACTCTAA GGATAAGGCT GGTGCTGCCC AGGGTGTCTA CAGGAACTGC 3000
AGGTGTCTAC CCCCAAGTCT TCCCTCCTCC CAAGCCAGGG GTGGCACAGG GCACTAGATC 3060
CCTGGAGTTC AGGAACCAAC ACAAGCACAA CCACGGGCAT AAGTTGGCCT TGGCCACTGC 3120
CACCCACGGC CCTCCTTTTG TGCTCCATGC TGGCATCTTC ACTCCCCTAC CCCTTCCCCA 3180
GCCACTGCTG CTCATTCAAA CTTCTGTCCA TGTCCCTCCA CTGTTCCTAT CAGCAGGTGG 3240
CCCCTGGGCA TCAGAACAGC CTGCCCTGGG CACCAGGTGG CAGACACACT CAGAGCATGT 3300
CTGGCTTTCC TGGTGGGTCC AGGCTCATTC TGCTTCTGAT TTCCCCTCCC CCAGGGCTCA 3360
TTTTCCCCCT TTTTCCTGTA CACATCCCTG TCTACCTCCT CTCACCCTGC CACAGATTCT 3420
TCCTATCACA CAGGGATGCC AGTTGTATTT GTGGG 3455
FIGURE 2A

Coding Sequence of Human ABCG4 Transporter Gene
Sequence Range: 1 to 3455

gcc acc atg gcg gag aag gcg ctg gag gcc gtg ggc tgt gga cta ggg ccg ggg gct gtg
        Met Ala Glu Lys Ala Leu Glu Ala Val Gly Cys Gly Leu Gly Pro Gly Ala Val

gcc atg gcc gtg acg ctg gag gac ggg gcg gaa ccc cct gtg ctg acc acg cac ctg aag
120

Ala Met Ala Val Thr Leu Glu Asp Gly Ala Glu Pro Pro Val Leu Thr Thr His Leu Lys

aag gtg gag aac cac atc act gaa gcc cag cgc ttc tcc cac ctg ccc aag cgc tca gcc
180

Lys Val Glu Asn His Ile Thr Glu Ala Gln Arg Phe Ser His Leu Pro Lys Arg Ser Ala

gtg gac atc gag ttc gtg gag ctg tcc tat tcc gtg cgg gag ggg ccc tgc tgg cgc aaa
240

Val Asp Ile Glu Phe Val Glu Leu Ser Tyr Ser Val Arg Glu Gly Pro Cys Trp Arg Lys

agg ggt tat aag acc ctt ctc aag tgc ctc tca ggt aaa ttc tgc cgc cgg gag ctg att
300

Arg Gly Tyr Lys Thr Leu Leu Lys Cys Leu Ser Gly Lys Phe Cys Arg Arg Glu Leu Ile

ggc atc atg ggc ccc tca ggg gct ggc aag tct aca ttc atg aac atc ttg gca gga tac
360

Gly Ile Met Gly Pro Ser Gly Ala Gly Lys Ser Thr Phe Met Asn Ile Leu Ala Gly Tyr

agg gag tct gga atg aag ggg cag atc ctg gtt aat gga agg cca cgg gag ctg agg acc
420

Arg Glu Ser Gly Met Lys Gly Gln Ile Leu Val Asn Gly Arg Pro Arg Glu Leu Arg Thr

ttc cgc aag atg tcc tgc tac atc atg caa gat gac atg ctg ctg ccg cac ctc acg gtg
480

Phe Arg Lys Met Ser Cys Tyr Ile Met Gln Asp Asp Met Leu Leu Pro His Leu Thr Val
FIGURE 2B

ttg gaa gcc atg atg gtc tct gct aac ctg aat ctt act gag aat ccc gat gtg aaa aac
540

Leu Glu Ala Met Met Val Ser Ala Asn Leu Asn Leu Thr Glu Asn Pro Asp Val Lys Asn

gat ctc gtg aca gag atc ctg acg gca ctg ggc ctg atg tcg tgc tcc cac acg agg aca
600

Asp Leu Val Thr Glu Ile Leu Thr Ala Leu Gly Leu Met Ser Cys Ser His Thr Arg Thr

gcc ctg ctc tct ggc ggg cag agg aag cgt ctg gcc atc gcc ctg gag ctg gtc aac aac
660

Ala Leu Leu Ser Gly Gly Gln Arg Lys Arg Leu Ala Ile Ala Leu Glu Leu Val Asn Asn

ccg cct gtc atg ttc ttt gat gag ccc acc agt ggt ctg gat agc gcc tct tgt ttc caa
720

Pro Pro Val Met Phe Phe Asp Glu Pro Thr Ser Gly Leu Asp Ser Ala Ser Cys Phe Gln

gtg gtg tcc ctc atg aag tcc ctg gca cag ggg ggc cgt acc atc atc tgc acc atc cac
780

Val Val Ser Leu Met Lys Ser Leu Ala Gln Gly Gly Arg Thr Ile Ile Cys Thr Ile His

cag ccc agt gcc aag ctc ttt gag atg ttt gac aag ctc tac atc ctg agc cag ggt cag
840

Gln Pro Ser Ala Lys Leu Phe Glu Met Phe Asp Lys Leu Tyr Ile Leu Ser Gln Gly Gln

tgc atc ttc aaa gga gtg gtc acc aac ctg atc ccc tat cta aag gga ctc ggc ttg cat
900

Cys Ile Phe Lys Gly Val Val Thr Asn Leu Ile Pro Tyr Leu Lys Gly Leu Gly Leu His

tgc ccc acc tac cac aac ccg gct gac ttc atc atc gag gtg gcc tct ggc gag tat gga
960

Cys Pro Thr Tyr His Asn Pro Ala Asp Phe Ile Ile Glu Val Ala Ser Gly Glu Tyr Gly

gac ctg aac ccc atg ttg ttc agg gct gtg cag aat ggg ctg tgc gct atg gct gag aag
1020

Asp Leu Asn Pro Met Leu Phe Arg Ala Val Gln Asn Gly Leu Cys Ala Met Ala Glu Lys

aag agc agc cct gag aag aac gag gtc cct gcc cca tgc cct cct tgt cct ccg gaa gtg
1080

Lys Ser Ser Pro Glu Lys Asn Glu Val Pro Ala Pro Cys Pro Pro Cys Pro Pro Glu Val
FIGURE 2C

gat ccc att gaa agc cac acc ttt gcc acc agc acc ctc aca cag ttc tgc atc ctc ttc
1140

Asp Pro Ile Glu Ser His Thr Phe Ala Thr Ser Thr Leu Thr Gln Phe Cys Ile Leu Phe

aag agg acc ttc ctg tcc atc ctc agg gac acg gtc ctg acc cac cta cgg ttc atg tcc
1200

Lys Arg Thr Phe Leu Ser Ile Leu Arg Asp Thr Val Leu Thr His Leu Arg Phe Met Ser

cac gtg gtt att ggc gtg ctc atc ggc ctc ctc tac ctg cat att ggc gac gat gcc agc
1260

His Val Val Ile Gly Val Leu Ile Gly Leu Leu Tyr Leu His Ile Gly Asp Asp Ala Ser

aag gtc ttc aac aac acc ggc tgc ctc ttc ttc tcc atg ctg ttc ctc atg ttc gcc gcc
1320

Lys Val Phe Asn Asn Thr Gly Cys Leu Phe Phe Ser Met Leu Phe Leu Met Phe Ala Ala

ctc atg cca act gtg ctc acc ttc ccc tta gag atg gcg gtc ttc atg agg gag cac ctc
1380

Leu Met Pro Thr Val Leu Thr Phe Pro Leu Glu Met Ala Val Phe Met Arg Glu His Leu

aac tac tgg tac agc ctc aaa gcg tat tac ctg gcc aag acc atg gct gac gtg ccc ttt
1440

Asn Tyr Trp Tyr Ser Leu Lys Ala Tyr Tyr Leu Ala Lys Thr Met Ala Asp Val Pro Phe

cag gtg gtg tgt ccg gtg gtc tac tgc agc att gtg tac tgg atg acg ggc cag ccc gct
1500

Gln Val Val Cys Pro Val Val Tyr Cys Ser Ile Val Tyr Trp Met Thr Gly Gln Pro Ala

gag acc agc cgc ttc ctg ctc ttc tca gcc ctg gcc acc gcc acc gcc ttg gtg gcc caa
1560

Glu Thr Ser Arg Phe Leu Leu Phe Ser Ala Leu Ala Thr Ala Thr Ala Leu Val Ala Gln

tct ttg ggg ctg ctg atc gga gct gct tcc aac tcc cta cag gtg gcc act ttt gtg ggc
1620

Ser Leu Gly Leu Leu Ile Gly Ala Ala Ser Asn Ser Leu Gln Val Ala Thr Phe Val Gly

cca gtt acc gcc atc cct gtc ctc ttg ttc tcc ggc ttc ttt gtc agc ttc aag acc atc
1680
FIGURE 2D

Pro Val Thr Ala Ile Pro Val Leu Leu Phe Ser Gly Phe Phe Val Ser Phe Lys Thr Ile

ccc act tac ctg caa tgg agc tcc tat ctc tcc tat gtc agg tat ggc ttt gag ggt gtg
1740

Pro Thr Tyr Leu Gln Trp Ser Ser Tyr Leu Ser Tyr Val Arg Tyr Gly Phe Glu Gly Val

atc ctg acg atc tat ggc atg gag cga gga gac ctg aca tgt tta gag gaa cgc tgc ccg
1800

Ile Leu Thr Ile Tyr Gly Met Glu Arg Gly Asp Leu Thr Cys Leu Glu Glu Arg Cys Pro

ttc cgg gag cca cag agc atc ctc cga gcg ctg gat gtg gag gat gcc aag ctc tac atg
1860

Phe Arg Glu Pro Gln Ser Ile Leu Arg Ala Leu Asp Val Glu Asp Ala Lys Leu Tyr Met

gac ttc ctg gtc ttg ggc atc ttc ttc cta gcc ctg cgg ctg ctg gcc tac ctt gtg ctg
1920

Asp Phe Leu Val Leu Gly Ile Phe Phe Leu Ala Leu Arg Leu Leu Ala Tyr Leu Val Leu

cgt tac cgg gtc aag tca gag aga tag agg ctt gcc cca gcc tgt acc cca gcc cct gca
1980

Arg Tyr Arg Val Lys Ser Glu Arg ***

gca gga agc ccc cag tcc cag ccc ttt ggg act gtt tta acc tta tag act tgg gca ctg
2040

gtt cct ggc ggg gct atc ctc tcc tcc ctt ggc tcc tcc aca ggc tgg ctg tcg gac tgc
2100

gct ccc agc ctg ggc tct ggg agt ggg ggc tcc agc cct ccc cac tat gcc cag gag tct
2160

tcc caa gtt gat gcg gtt tgt agc ttc ctc cct act ctc tcc aac acc tgc atg caa aga
2220

cta ctg gga ggc tgc tgc ctc ctt cct gcc cat ggc acc ctc ctc tgc tgt ctg cct ggg
2280

agc cct agg ctc tct agg gcc cca ctt aca act gac caa agt ggc ccc ctc tgg ggg tcc
2340

cca cca cac aag tgt ttg taa act ggg ctg cta taa ggt tgg agt tcc agg gct ggg ccc
2400



Title: NOVEL ABCG4 TRANSPORTER AND USES THEREOF
Inventors: Chen et al. Serial No. Unassigned Docket No. 100103.406

Express Mail No. EL755731114US

FIGURE 2E

tgg tgg agt cca ctg gaa gtc cca tta tgg atg ttg aaa tgg aca ggg aag gac tct gga
2460

agt ctc ttc ctc ctc ctc ctc ttc tct cca ccc cta gac cct ggc tga ctt gga caa tct
2520

gcc agg aca gaa gct ggg ttt tct gtc tag gtc acc act ccc aat cct ggg gat tgg aga
2580

ggc ctg ggg ctg tgg gat gcc cca tcc ccc tcc cca tca cct ttg gtg ggg gca ggg cct
2640

ggt ggc acc tgt gca ata atg tct gtg ttt ctc tcc cac ctg cca ctg gaa ctg gag aat
2700

gca ctt tat tct ggg cgg ggg gtg agt ggg gga aga ccc aac cct cct ttc tcg ctg ccc
2760

cta acg cat gca cgg tct cgt gat gct ccc tcc ctc tcc gga gtg aca ggc aca tac atg
2820

aga aca ggc cat ctc agc cct aca cac ttg cca tcc cct aca gca cag agg aag agt gat
2880

ggt ggc atg ctg gtg gtg gcg ggt gct ggt ggg agg aca gtg cca acc tcc tcc tgg gga
2940

tcc cat gtt gga gac tct aag gat aag gct ggt gct gcc cag ggt gtc tac agg aac tgc
3000

agg tgt cta ccc cca agt ctt ccc tcc tcc caa gcc agg ggt ggc aca ggg cac tag atc
3060

cct gga gtt cag gaa cca aca caa gca caa cca cgg gca taa gtt ggc ctt ggc cac tgc
3120

cac cca cgg ccc tcc ttt tgt gct cca tgc tgg cat ctt cac tcc cct acc cct tcc cca
3180

gcc act gct gct cat tca aac ttc tgt cca tgt ccc tcc act gtt cct atc agc agg tgg
3240

ccc ctg ggc atc aga aca gcc tgc cct ggg cac cag gtg gca gac aca ctc aga gca tgt
3300

ctg gct ttc ctg gtg ggt cca ggc tca ttc tgc ttc tga ttt ccc ctc ccc cag ggc tca
3360

ttt tcc ccc ttt ttc ctg tac aca tcc ctg tct acc tcc tct cac cct gcc aca gat tct
3420

tcc tat cac aca ggg atg cca gtt gta ttt gtg gg 3455
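
The codon-to-residue pairing shown in Figures 2A to 2E can be reproduced mechanically. The following is a minimal Python sketch, not part of the original figures, assuming the standard genetic code and that the open reading frame begins at the first ATG (nucleotide 7 in the Figure 2A numbering). The names `CODON_TABLE` and `translate` are illustrative only.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
# Assumption: the figures use the standard (nuclear) code.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, start=None):
    """Translate DNA to one-letter protein code, stopping at the first stop codon.

    Whitespace is ignored, so spaced codon lines can be pasted directly.
    If `start` is None, translation begins at the first ATG found.
    Codons containing ambiguity codes (R, M, S, K, ...) are reported as 'X'.
    """
    seq = "".join(dna.split()).upper()
    if start is None:
        start = seq.find("ATG")
        if start < 0:
            raise ValueError("no ATG start codon found")
    protein = []
    for i in range(start, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        if aa == "*":  # stop codon ends the open reading frame
            break
        protein.append(aa)
    return "".join(protein)


# First codon line of Figure 2A (nucleotides 1 to 60).
first_line = "gcc acc atg gcg gag aag gcg ctg gag gcc gtg ggc tgt gga cta ggg ccg ggg gct gtg"
print(translate(first_line))  # MAEKALEAVGCGLGPGAV
```

Applied to the full 3455-nucleotide sequence, the same routine would be expected to stop at the `tag` codon marked `***` in Figure 2D, though that run is not shown here.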






FIGURE 3 

Predicted Protein Sequence of Human ABCG4 Transporter 

MAEKALEAVGCGLGPGAVAMAVTLEDGAEPPVLTTHLKKVENHITEAQRF 50
SHLPKRSAVDIEFVELSYSVREGPCWRKRGYKTLLKCLSGKFCRRELIGI 100
MGPSGAGKSTFMNILAGYRESGMKGQILVNGRPRELRTFRKMSCYIMQDD 150
MLLPHLTVLEAMMVSANLNLTENPDVKNDLVTEILTALGLMSCSHTRTAL 200
LSGGQRKRLAIALELVNNPPVMFFDEPTSGLDSASCFQVVSLMKSLAQGG 250
RTIICTIHQPSAKLFEMFDKLYILSQGQCIFKGVVTNLIPYLKGLGLHCP 300
TYHNPADFIIEVASGEYGDLNPMLFRAVQNGLCAMAEKKSSPEKNEVPAP 350
CPPCPPEVDPIESHTFATSTLTQFCILFKRTFLSILRDTVLTHLRFMSHV 400
VIGVLIGLLYLHIGDDASKVFNNTGCLFFSMLFLMFAALMPTVLTFPLEM 450
AVFMREHLNYWYSLKAYYLAKTMADVPFQVVCPVVYCSIVYWMTGQPAET 500
SRFLLFSALATATALVAQSLGLLIGAASNSLQVATFVGPVTAIPVLLFSG 550
FFVSFKTIPTYLQWSSYLSYVRYGFEGVILTIYGMERGDLTCLEERCPFR 600
EPQSILRALDVEDAKLYMDFLVLGIFFLALRLLAYLVLRYRVKSER 646

Transmembrane domains are underlined.

Walker A    C signature    Walker B
FIGURE 4A



ClustalW Multiple Sequence Alignment of the Members of the ABCG Subfamily 



ABCGl 



77 



ABCG4 



HI 



"T MAAFSVGTAMNASSYSAEMTEPKSVCVSVDEVVSSNMEATETDLLNGHLKKVDNNLTEAQRFSSLPRRAAVNIEFRDI 



1 MAEKALEAVGCGLGPGAVAMAVT LEDGAEPPVLTTHLKKVENHITEAQRFSHLPKRSAVDIEFVE| 



ABCG2 1 MSSSNVEVFIP^ VSQGNTNGFPATVSN-— DLKAFTEGAVLSFHNICYR- 4 5| 



ABCG5 



MGDLSSLTPGGSMGLQVNRG- 



■SQSSLEGAPATAPEP- 



-HSLGILHASYSVSHR- 



ABCG8 



1 MAGKAAEERGLPKGATPQDTSGLQDRLFS SESDNSLYFTYSGQPNTLEVR— DLNYQVDLASQVPWFEQLAQI 



ABCGl 



ABCG4 



1431 
ABCG2 



7 8 LS YS VPEGPWWRKKGYKTLLKGI SGKFNSGELVAIMM&SA<SRS^^ 



55 



66 LSYSVREGPCWRKRGYKTLLKCLSGKFCRRELIGIMSBgg^VGI^ 



ABCG5 



1291 
ABCG8 



46 VKLKSGFLPCR-KPVEKEILSNINGIMKPG-LNAIL:GPT.Gg.GKS.S.LLDVLAARKDPSG-LSGDVLINGAPR-PA 
51 VRPWWDITSCR-QQWTRQILKDVSLYVESGQIMCILiSSrag^OKT,!^ 



71 FKMPWTSPSCQ--NSCELGIQNLSFKVRSGQMLAIIg$lMi5MiLLDVITGRGHGGKIKSGQIWINGQPSSPQLVRKC^ 



IaBCGI 1 5 6 CYIMQDDMLLPHLTVQEAMMVSAHLKLQ— EKDEGRREMVKEILTALGLLSCANTRTGS (LTSlggC^BK'EaiAIALELVl 

|aBCG4 144 CYIMQDDMLLPHLTVLEAMMVSANLNLT— ENPDVKNDLVTEILTALGLMSCSHTRTAL LSGGQRKRLAIALELV| 

^ : ^ . , 

|ABCG2 122 GYVVQDDVVMGTLTVRENLQFSAALRLATTMTNHEKNERINRVIEELGLDKVADSKVGTQFIRGVSGGERKRTSIGMELIj 

^ ^ , 

IaBCGS • 130 SYVLQSDTLLSSLTVRETLHYTALLAIRR-GNPGSFQKKVEA\mAELSLSHVADRLIGNYSLGG.ir^S|rG£RRRVSIAAQLL^ 

|208| ^ , , 

|aBCG8 149 AHVRQHNQLLPNLTVRETLAFIAQMRLPRTFSQAQRDKRVEDVIAELRLRQCADTRVGNMYVRGIiSQgEBRg;VSIGVQLL| 



IaBCGI 229 NNPP^lteEijllSGLDSASCFQWSLMKGIAQGGRSIICTIHQPSAKLFELFDQLYV 

g08| 

|aBCG4 217 NNPPpiiPiPlSGLDSASCFQVVSLMKSLAQGGRTIICTIHQPSAKLFEMFDKLYILSQGQCIFKGVVTNLIP 



202 TDPSp:'fiS;L^D,£ETiTGLDSSTANAVLLLLKRMSKQGRTIIFSIHQPRYSIFKLFDSLTLLASGRLMFHGPAQEA^^^ 
209 QDPiqpilliPaTGLDCMTANQIVVLLVEIARRNRIVVLTIHQPRSELFQLFDKIAILSF 



BCG2 



|28ll 
lABCGS 



229 WNP&iirM00.P:i|SGLDSFTAHNLVKTLSRLAKGNRLVLISLHQPRSDIFR^ 




L 



lABCGl 309 LNCPTYHNPADFVMEVASGEYGD— QN-SRLVRAVREGMCDSDHKRDLGGDAEVNPFLWHRPSEEVKQTKRLKGLRKPS^ 

^ - , 

|ABCG4 297 LHCPTYHNPADFIIEVASGEYGD--LN-PMLFRAVQNGLCAMAEK KSSPEKNEVPAPCPPCPPE-| 

1357] 

IABCG2 • 282 yhceaynnpadffldiingdstavalnreedfkateiiepskqdkp-lieklaeiyvnssfyketkaelhqlsggekkkk| 



ABCG5 289 YPCPEHSNPFDFYMDLTSVDTQSKERE-IETSKRVQMIESAYKKS AICHKTL KNIERMKHLKTLP-I

ABCG8 309 YPCPRYSNPADFYVDLTSIDRRSREQE-LATREKAQSLAALFLEKVR DLDDFLWKAETKDLDED TCVESSVTPL



SSMEGCHSFSASCLTQFCILFKRTFLSIMRDSVLTHLRITSHIGIGLLIGLLYLGIGNETKK— VLSNSGFLFFSMLFL?^ 



ABCGl 385 



q62] \ , , 3 

SbcG4 358 VDPIESHTFATSTLTQFCILFKRTFLSILRDTVLTHLRFMSHVVIGVLIGLLYLHIGDDASK— VFNNTGCLFFSMLFLMj 



I»i35| ^ ^ : , 

ABCG2 361 ITVFKEISYT TSFCHQLRWVSKRSFKNLLGNPQASIAQIIVTVVLGLVIGAIYFGLKNDSTG ->IONRAGVLFFLTTNQq 



|438| , ^ 

SbcGS — 353 MVPFKTKD'SPGVFSKLGVLLRRVTRNLVRNKLAVITRLLQNLIMGLFLLFFVLRVRSNVLKGAIQDRVGLLYQFVGATPI 



ABCG8 383 TNCLPSPTKMP' 



GAVQQFTTLIRRQISNDFRDLPTLLIHGAEACLMSMTIGFLYFGHGSIQLS— FMDTAALLFMIGALIPl 



460| , I 



ABCGl 463 faalmptvltfplemgvflrehlnywyslkayylaktmadvpfqimfp-vaycsivywmtsqpsdavrfvlfaalgtmtsI 



54l| ' ■__ , 

ABCG4 436 faalmptvl tfplemavfmrehlnywyslkayylaktmadvpfqvvcp-vvycsivywmtgqpaetsrf llfsalatataI 



ABCG2 439 fssvs-avelf vvekklfiheyisgyyrvssyflgkllsdllpmrmlpsiiftcivyfmlglkp k adaffvmmftlmmva| 

3 a 

lAB CGS 432. ytgmlnavn lfpvlravsdqesqdglyqkwqmmlayalhvlpfswat-mifssvcywtlglhpevarf gyfsaallaph| 



ABCG8 461 fnvildvisk cyseramlyyeledglyttgpyffakilgelpehcayi-iiygmptywlanlrpglqpfllh f llvwlvvi 



ABCGl 


542 


lvaqslgllig-aastslqvatfvgpvtaipvllfsgffvsfdtiptylqwmsyisyvrygfegvilsiyg L| 


6121 






ABCG4 


515 


LVAOSLGLLIG-AASNSLOVATFVGPVTAIPVLLFSGFFVSFKTIPTYLQWSSYLSYVRYGFEGVILTIYG M| 


5851 


ABCG2 


518 


YRAS55MALAIA-AGOSVVSVATLLMTICFVFMMIFSGLLVNLTTIASWLSWLQYFSIPRYGFTALQHNEFLGQNFCPGLN1 


Hi • -, 


ABCG5 


511 


ligefltlvllgivqnpnivnsvvallsiagvlvgsgflrniqempipfkiisyftfqkycseilvvnefygln FTq 


587| 






ABCG8 


540 


fccrimalaaa-allptfhmasffsnalynsfylaggfminlsslwtvpawiskvsflrwcfeglmkiqfs R| 


lisl , 



ABCG1 613 DREDLHCDIDETCHFQ-KSEAILRELDVENAKLYLDFIVLGIFFISLRLIAYLVLRYKIRAER 674
ABCG4 586 ERGDLTC-LEERCPFR-EPQSILRALDVEDAKLYMDFLVLGIFFLALRLLAYLVLRYRVKSER 646
ABCG2 597 ATGNNPC-NYATCTG--EEYLVKQGIDLSPWGLWKNHVALACMIVIFLTIAYLKLLFLKKYS 655
ABCG5 588 GSSNVSVTTNPMCAFTQGIQFIEKTCPGATSRFTMNFLILYSFIPALVILGIVVFKIRDHLISR 651
ABCG8 611 RTYKMPLGNLTIAVS--GDKILSVMELDSYPLYAIYLIVIGLSGGFMVLYYVSLRFIKQKPSQDW 673
FIGURE 5

ClustalW Multiple Sequence Alignment of Partial Human ABCG4 Transporter in GenBank
(AN: CAC17140) and Human ABCG4 Transporter of this Invention

ABCG4 vs. CAC17140

ABCG4    1   MAEKALEAVGCGLGPGAVAMAVTLEDGAEPPVLTTHLKKVENHITEAQRFSHLPKRSAVDIEFVELSYSVREGPCWRKRG 80
CAC17140 1   -------------------MAVTLEDGAEPPVLTTHLKKVENHITEAQRFSHLPKRSAVDIEFVELSYSVREGPCWRKRG 61



lleol .. 

|CAC17140 62 YKTLLKCLSGKFCRRELliSI>H6gSMGKSaFMNILAGYRESGMKGQILVNGRPRELRTFRKMSCYIMQDDm 
^ :::::.::::::::::::::: 7rTTTTTT71 



|aBCG4 161 AMMVSANLNLTENPDVKNDLVTEILTALGLMSCSHTRTALjSGGQRKRLAIALELVNNPP^B^ 
|cAC17140 142 AMMVSANLKLSEKQEVKKELVTEILTALGLMSCSHTRTALIiSiGgQR!^^ 

P „ r iTT^g 



ABCG4    241 SLMKSLAQGGRTIICTIHQPSAKLFEMFDKLYILSQGQCIFKGVVTNLIPYLKGLGLHCPTYHNPADFIIEVASGEYGDL 320
CAC17140 222 SLMKSLAQGGRTIICTIHQPSAKLFEMFDKLYILSQGQCIFKGVVTNLIPYLKGLGLHCPTYHNPADFIIEVASGEYGDL 301

ABCG4    321 NPMLFRAVQNGLCAMAEKKSSPEKNEVPAPCPPCPPEVDPIESHTFATSTLTQFCILFKRTFLSILRDTVLTHLRFMSHV 400
CAC17140 302 NPMLFRAVQNGLCAMAEKKSSPEKNEVPAPCPPCPPEVDPIESHTFATSTLTQFCILFKRTFLSILRDTVLTHLRFMSHV 381

ABCG4    401 VIGVLIGLLYLHIGDDASKVFNNTGCLFFSMLFLMFAALMPTVLTFPLEMAVFMREHLNYWYSLKAYYLAKTMADVPFQV 480
CAC17140 382 VIGVLIGLLYLHIGDDASKVFNNTGCLFFSMLFLMFAALMPTVLTFPLEMAVFMREHLNYWYSLKAYYLAKTMADVPFQV 461

ABCG4    481 VCPVVYCSIVYWMTGQPAETSRFLLFSALATATALVAQSLGLLIGAASNSLQVATFVGPVTAIPVLLFSGFFVSFKTIPT 560
CAC17140 462 VCPVVYCSIVYWMTGQPAETSRFLLFSRLATATALVAQSLGLLIGAASNSLQVATFVGPVTAIPVLLFSGFFVSFKTIPT 541

ABCG4    561 YLQWSSYLSYVRYGFEGVILTIYGMERGDLTCLEERCPFREPQSILRALDVEDAKLYMDFLVLGIFFLALRLLAYLVLRY 640
CAC17140 542 YLQWSSYLSYVRYGFEGVILTIYGMERGDLTCLEERCPFREPQSILRALDVEDAKLYMDFLVLGIFFLALRLLAYLVLRY 621

ABCG4    641 RVKSER 646
CAC17140 622 RVKSER 627
Figure 6

Figure 7

Directory: Mapleleaf/target clone/ABCG family/abcg4/g4cdna (Assembly file)



G4-clone nucleotide sequence range 1 to 2687 

taccgagctcggatccactagtccagtgtggtggaattgcccttgccaccATGgcggagaaggcgctggagg

CCGTGGGCTGTGGACTAGGGCCGGGGGCTGTGGCCATGGCCGTGACGCTGGAGGACGGGGCGGAACCCCCTG 
TGCTGACCACGCACCTGAAGAAGGTGGAGAACCACATCACTGAAGCCCAGCGCTTCTCCCACCTACCCAAGC 
GCTCAGCCGTGGACATCGAGTTCGTGGAGCTGTCCTATTCCGTGCGGGAGGGGCCCTGCTGGCGCAAAAGGG 
GTTATAAGACCCTTCTCAAGTGCCTCTCAGGTAAATTCTGCCGCCGGGAGCTGATTGGCATCATGGGCCCCT 
CAGGGGCTGGCAAGTCTACATTCATGAACATCTTGGCAGGATACAGGGAGTCTGGAATGAAGGGGCAGATCC 
TGGTTAATGGAAGGCCACGGGAGCTGAGGACCTTCCGCAAGATGTCCTGCTACATCATGCAAGATGACATGC 
TGCTGCCGCACCTCACGGTGTTGGAAGCCATGATGGTCTCTGCTAACCTGAAGCTGAGTGAGAAGCAGGAGG 
TGAAGAAGGAGCTGGTGACAGAGATCCTGACGGCACTGGGCCTGATGTCGTGCTCCCGCACGAGGACAGCCC 
TGCTCTCTGGCGGGCAGAGGAAGCGTCTGGCCATCGCCCTGGAGCTGGTCAACAACCCGCCTGTCATGTTCT 
TTGATGAGCCCACCAGTGGTCTGGATAGCGCCTCTTGTTTCCAAGTGGTGTCCCTCATGAAGTCCCTGGCAC 
AGGGGGGCCGTACCATCATCTGCACCATCCACCAGCCCAGTGCCAAGCTCTTTGAGATGTTTGACAAGCTCT 
ACATCCTGAGCCAGGGTCAGTGCATCTTCAAAGGCGTGGTCACCAACCTGATCCCCTATCTAAAGGGACTCG 
GCTTGCATTGCCCCACCTACCACAACCCGGCTGACTTCATCATCGAGGTGGCCTCTGGCGAGTATGGAGACC 
TGAACCCCATGTTGTTCAGGGCTGTGCAGAATGGGCTGTGCGCTATGGCTGAGAAGAAGAGCAGCCCTGAGA 
AGAACGAGGTCCCTGCCCCATGCCCTCCTTGTCCTCCGGAAGTGGATCCCATTGAAAGCCACACCTTTGCCA 
CCAGCACCCTCACACAGTTCTGCATCCTCTTCAAGAGGACCTTCCTGTCCATCCTCAGGGACACGGTCCTGA 
CCCACCTACGGTTCATGTCCCACGTGGTTATTGGCGTGCTCATCGGCCTCCTCTACCTGCATATTGGCGACG
ATGCCAGCAAGGTCTTCAACAACACCGGCTGCCTCTTCTTCTCCATGCTGTTCCTCATGTTCGCCGCCCTCA
TGCCAACTGTGCTCACCTTCCCCTTAGAGATGGCGGTCTTCATGAGGGAGCACCTCAACTACTGGTACAGCC
TCAAAGCGTATTACCTGGCCAAGACCATGGCTGACGTGCCCTTTCAGGTGGTGTGTCCGGTGGTCTACTGCA
GCATTGTGTACTGGATGACGGGCCAGCCCGCTGAGACCAGCCGCTTCCTGCTCTTCTCAGCCCTGGCCACCG
CCACCGCCTTGGTGGCCCAATCTTTGGGGCTGCTGATCGGAGCTGCTTCCAACTCCCTACAGGTGGCCACTT
TTGTGGGCCCAGTTACCGCCATCCCTGTCCTCTTGTTCTCCGGCTTCTTTGTCAGCTTCAAGACCATCCCCA 
CTTACCTGCAATGGAGCTCCTATCTCTCCTATGTCAGGTATGGCTTTGAGGGTGTGRTCCTGACGATCTATG 



GCATGGAGCGAGGAGACCTGACATGTTTAGAGGAACGCTGCCMGTTCCGGGAGCCACAGAGCATCCTCCGAG 
CGCTGGATGTGGAGGATGCCAAGCTCTACATGGACTTCCTGGTCTTGGGCATCTTCTTCCTAGCCCTGCGGC 



TGCTGGCCTACCTTGTGCTGCGTTACCGGGTCAAGTCAGAGAGATAGAGGCTTGCCCCAGCCTGTACCCCAG
CCCCTGCAGCAGGAAGCCCCCAGTCCCAGCCCTTTGGGACTGTTTTAACCTTATAGACTTGGGCACTGGTTC
CTGGCGGGGCTATCCTCTCCTCCCTTGGCTCCTCCACAGGCTGGCTGTCGGACTGCGCTCCCAGCCTGGGCT
CTGGGAGTGGGGGCTCCAGCCCTCCCCACTATGCCCAGGAGTCTTCCCAAGTTGATGCGGTTTGTAGCTTCC
TCCCTACTCTCTCCAACACCTGCATGCAAAGACTACTGGGAGGCTGCTGCCTCCTTCCTGCCCATGGCACCC
TCCTCTGCTGTCTGCCTGGGAGCCCTAGGCTCTCTAGGGCCCCACTTACAACTGACCAAAGTGGCCCCCTCT 
KGGGGTCCCCACCACACAAGTGTTTGTAAACTGGGCTGCTATAAGGTTGGAGTTCCAGGGCTGGGCCCTGGT 
GGAGTCCACTGGAAGTCCCATCATGGATGTTGAAATGGACAGGGAAGGACTCTGGAAGTCTCTTCCTCCTCC 
TCCTCTTCTCTCCACCCCTAGACCCTGGCTGACTTGGACAATCTGCCAGGACAGAAGCTGGGGTTTTCTGTC 
TAGGTCACCACTCCCAATCCTGGGGGRTTGGAGRGGCCTGGGGSTGTGGGRTGSCCCATCCCCCTCCCCATC 

ACCTTTGGTGGGGGSAGGGCCTG 



Figure 10 
G4-clone polypeptide sequence range 1 to 646

MAEKALEAVGCGLGPGAVAMAVTLEDGAEPPVLTTHLKKVENHITEAQRFSHLPKRSAVD 
IEFVELSYSVREGPCWRKRGYKTLLKCLSGKFCRRELIGIMGPSGAGKSTFMNILAGYRE
SGMKGQILVNGRPRELRTFRKMSCYIMQDDMLLPHLTVLEAMMVSANLKLSEKQEVKKEL 
VTEILTALGLMSCSRTRTALLSGGQRKRLAIALELVNNPPVMFFDEPTSGLDSASCFQVV 
SLMKSLAQGGRTIICTIHQPSAKLFEMFDKLYILSQGQCIFKGVVTNLIPYLKGLGLHCP 
TYHNPADFIIEVASGEYGDLNPMLFRAVQNGLCAMAEKKSSPEKNEVPAPCPPCPPEVDP 
IESHTFATSTLTQFCILFKRTFLSILRDTVLTHLRFMSHVVIGVLIGLLYLHIGDDASKV
FNNTGCLFFSMLFLMFAALMPTVLTFPLEMAVFMREHLNYWYSLKAYYLAKTMADVPFQV 
VCPVVYCSIVYWMTGQPAETSRFLLFSALATATALVAQSLGLLIGAASNSLQVATFVGPV 
TAIPVLLFSGFFVSFKTIPTYLQWSSYLSYVRYGFEGVXLTIYGMERGDLTCLEERCXFR 
EPQSILRALDVEDAKLYMDFLVLGIFFLALRLLAYLVLRYRVKSER 



Figure 11



